perceptions and experiences with such systems is essential for enhancing their 
effectiveness and acceptance in educational practice. 

Aims: This pilot study aimed to examine the impact of STACK (System for 
Teaching and Assessment using a Computer algebra Kernel)-based online exams 
on various facets such as student comfort, perceived value for learning and exam 
preparation, preferences between pen-and-paper versus STACK formats, 
confidence in online math tests, readiness to adopt online assessment platforms, 
and overall perspectives on online evaluations. 

Method: An experimental within-subjects design and convenience sampling 
were used to involve 117 first-year biology students enrolled in a Probability and 
Statistics course, who were already familiar with the STACK system. Data were 
collected using pre- and post-exam online surveys featuring five-point Likert 
scale questions and an open-ended query. 

Results: The findings indicate that students felt more comfortable using the 
STACK system after the study and preferred it over traditional exams. 
Nevertheless, some students expressed uncertainty about using STACK content 
for final exams due to concerns about its effectiveness in evaluating critical 
thinking and potential technical difficulties. However, concerns regarding 
technical challenges decreased significantly post-exam, with no technical issues 
reported during the exam. Positive feedback highlighted STACK's benefits for 
formative assessment, easier learning, immediate feedback, and its practicality 
and innovation. Some students even suggested incorporating STACK into final 
exams for convenience and advocated for further investment in the STACK 
system, possibly with improved content. 

Conclusion: In summary, students preferred using STACK for exams, though 
concerns about technical glitches and the need to refine content for assessing 
critical thinking persist. Future efforts should focus on enhancing content quality, 
starting this summer. 
4(1), 25-40. 


INTRODUCTION 


Gagné's theory of learning, which encompasses nine instructional events, highlights the critical 
importance of assessment, positioning it as fundamental to the learning process (Gagné, 1985). This 
framework posits that effective assessment is crucial at various stages of learning to reinforce 
knowledge acquisition and ensure long-term retention. Consequently, this has led to considerable 
discussion on how to optimize learning through assessment (Li & Schoenfeld, 2019). Recently, there 
has been a significant shift towards the adoption of technology-driven assessment methods, driven 
by the recognition that traditional paper-based assessments may not fully exploit the potential of 
dynamic and interactive methods enabled by technology (Cézar et al., 2021). As a result, educators 
are increasingly exploring innovative assessment tools and platforms to enhance the learning 
experience and foster deeper engagement among students (Hsu & Wu, 2023; Tursyngozhayev et al., 
2024). 

Building on this shift towards technology-driven assessment methods, the COVID-19 pandemic 
has profoundly impacted education worldwide, sparking discussions about the necessity to 
restructure assessment methods. The abrupt transition to remote and hybrid learning models 
necessitated by the pandemic compelled educators to rethink traditional assessment approaches. As 
Zhao and Watterston (2021) have highlighted, the challenges brought on by the pandemic have 
accelerated these discussions, leading to an increased awareness of the limitations inherent in 
traditional assessment methods within remote learning environments. Educators and researchers 
have recognized the need for more adaptable, technology-enabled assessment strategies that can 
effectively support remote and online learning modalities. In response to these challenges, 
educational institutions have increasingly adopted online assessment tools to facilitate remote 
learning and assessment. The use of online assessment tools has grown significantly in recent years, 
as evidenced by studies from Csapó and Molnár (2019), García-Peñalvo et al. (2021), Juma (2023), 
and Zerva (2020). These studies have documented the widespread implementation of online 
assessment platforms across various educational contexts, from primary schools to higher education 
institutions. The growth of online assessment tools can be attributed to several factors, including the 
need for greater flexibility, time-saving assessment methods, and advancements in technology that 
have facilitated remote learning. Additionally, the aim to provide students with a more interactive 
and engaging educational experience has driven the adoption of online assessment tools, as 
highlighted by Serutla et al. (2024). 

In line with the shift towards technology-driven assessment methods, a notable example of an 
online assessment tool is STACK (System for Teaching and Assessment using a Computer Algebra 
Kernel), a tool tailored for STEM subjects (Sangwin, 2015). With its 
open-source plug-in accessible through platforms like Moodle, STACK revolutionizes teaching, 
learning, and assessment practices across STEM disciplines. Powered by the open-source computer 
algebra system, Maxima, STACK empowers educators to create dynamically generated mathematical 
questions within structured templates (Sangwin, 2015). This feature not only enhances the efficiency 
and systematization of teaching but also fosters personalized learning experiences for students. 
Furthermore, STACK's structured models rigorously evaluate student responses, providing targeted 
feedback tailored to the specific types of errors encountered (Moodle Plugins Directory, 2024). 
Importantly, STACK's impact extends beyond its technical functionalities to its real-world 
applications in diverse educational settings. For instance, researchers at Maseno University, Kenya 
(Juma, 2023), conducted a study on the effectiveness of STACK on learner engagement, performance, 
and perception in mathematics. Their findings underscored the positive correlation between STACK 
scores and end-of-semester exam results, highlighting its effectiveness in enhancing student learning 
outcomes. Additionally, the University of Birmingham (Sangwin & Hermans, 2013) reported on the 
use of STACK in mathematics education, emphasizing its role in addressing criticisms of traditional 
assessment methods and fostering valid assessment practices. Similarly, in Finland, researchers 
explored the correlation between STACK performance and exam grades, revealing insights into 
student behavior and learning patterns. This study demonstrated the positive impact of STACK on 
student engagement and problem-solving skills in mathematics education (Makela et al., 2016). 
Furthermore, the development of Portable STACK in Japan (Nakamura et al., 2013) showcased 
STACK's adaptability and accessibility, providing students with convenient access to mathematics 
exercises and promoting independent learning. These examples highlight STACK's versatility and 
efficacy in various educational contexts, from Africa to Europe and Asia, reinforcing its significance 
as a transformative tool for teaching, learning, and assessment in STEM disciplines. 

As education continues to evolve with technological integration and pedagogical 
advancements, understanding learner experiences with these tools becomes paramount. According 
to Olasina (2023), piloting these tools in various contexts is essential to provide insights into their 
effectiveness, tailor their integration to meet diverse student needs, and contribute to the ongoing 
evolution of pedagogical practices. This approach aids in the development of evidence-based 
strategies for the use of technology in education. Furthermore, students are not passive recipients of 
education but active participants whose attitudes are shaped either by pedagogy or through the 
general perception of the subject as a whole (Acosta-Gonzaga & Walet, 2018; Han & Liou-Mark, 
2023). This understanding underscores the importance of continually assessing and adapting 
educational tools to enhance learning experiences and outcomes. As such, incorporating student 
feedback and experiences into the development and refinement of technological tools is crucial for 
ensuring their effectiveness and acceptance in educational practice. 

Many research studies on student experiences with online examinations utilizing technological 
tools have revealed a spectrum of outcomes, encompassing both positive and negative aspects. Tai 
et al. (2022), employing the Technology Acceptance Model (TAM) in a quantitative survey, identified 
the benefits of online assessments, particularly accentuated during the COVID-19 pandemic. Their 
research established significant correlations between perceived ease of use, usefulness, and various 
advantages linked with online assessments, including enhanced accessibility, streamlined 
administration processes, and the provision of prompt feedback. Moreover, the reduced anxiety 
reported by students can be attributed to several factors, supported by theories such as Lazarus and 
Folkman's Transactional Theory of Stress and Coping (Krohne, 2002). Online assessments provide a 
less socially intimidating environment, reducing anxiety associated with peer and authority figure 
presence. Additionally, the autonomy offered by online assessments aligns with Deci and Ryan's Self- 
Determination Theory (Vallerand, 2000), fulfilling students’ psychological needs for autonomy and 
competence, thereby reducing anxiety levels. These findings resonate with similar conclusions from 
Butler-Henderson & Crawford's (2020) systematic review, which examined student perceptions, 
performance, and anxiety related to online examinations. Similarly, Raman et al. (2021) reviewed the 
types, architecture, challenges, and prospects of online proctored examinations. Their work, focusing 
on the student adoption experience, provides insights into the effectiveness of online examinations. 
The advantages reported by students, such as preventing time loss, reducing exam anxiety, and 
quickly learning exam results, align closely with the benefits highlighted in this study, strengthening 
the foundation for understanding the multifaceted nature of student experiences with online 
assessments (Raman et al., 2021). 

The impact of gender on online assessment in STEM education is a complex and multifaceted 
issue (Idrizi et al., 2023). For instance, a correlational study found that female students outperformed 
males in online STEM courses, a field that has traditionally been male-dominated, highlighting 
the individualistic nature of engagement in such courses and making it difficult to generalize by 
gender when assessing online tools in STEM (Idrizi et al., 2023). Another study highlighted the 
potential of the anonymity and flexibility offered by online learning environments to attract more 
females to STEM studies, thereby mitigating gender disparities (Wladis et al., 2015; Wood et al., 
2021). Conversely, some research suggests that males might possess advantages in online 
classrooms, attributed to their higher perceived ability, comfort, and engagement with technology 
(Korlat et al., 2021). These mixed findings in the literature underline the need for continued 
investigation in this domain. In this pilot study, sex 
differences were examined to identify the impact of using STACK content in exams on the two groups, 
aiming to shed light on the complex dynamics identified in existing literature. The pilot course under 
investigation comprised 85 (73.5%) females and 31 (26.5%) males. Recognizing this imbalance as a 
potential limitation in interpreting the analysis, this study highlights it as a focal point for further 
exploration in future studies. 

A critical look at the existing research on student experiences in online examinations raises 
several considerations. In particular, concerns about the reliability of 
online assessments have been raised, with factors such as technical glitches, internet connectivity 
issues, the potential for cheating, and the ability to assess critical thinking skills beyond 
computational abilities casting doubt on the credibility of these evaluations (Beliauskene & 
Yanuschik, 2021; Juma, 2023). 

Despite the numerous studies on online assessments, the prevailing positive results in student 
perceptions are often skewed towards non-STEM fields, revealing a notable gap in understanding 
how students in science, technology, engineering, and mathematics respond to online assessment 
tools (Chen et al., 2018; Meng et al., 2014). Amid this gap, STACK emerges as a focal point for 
investigation due to its emphasis on STEM subjects, integration of computer algebra systems, and 
potential impact on pedagogy in diverse contexts such as the Italian education system. This study 
was specifically designed to investigate learner perceptions of the use of STACK Technology for 
assessment in exams, extending beyond regular formative assessments. The University of Trieste, 
Italy, intends to integrate the STACK system in undergraduate STEM courses, making it essential to 
understand how students perceive and adapt to digital assessment tools like STACK. By tailoring 
educational practices to the local context and addressing any specific challenges, this research aims 
to contribute to the broader understanding and effective implementation of STACK in diverse 
educational settings. 


The Context: University of Trieste 

The University of Trieste (see Home Page | Università degli Studi di Trieste, n.d.) is a public 
research university in northeast Italy, consisting of 10 departments, about 15,000 students and 1,000 
staff. The University was founded in 1924 and celebrates its centenary this year. 
Initiatives in STEM education, with special attention to digital didactics, are a pivotal aspect of UniTS's 
commitment to staying at the forefront of educational practices. In this context, the Department of 
Mathematics, Informatics and Geosciences (MIGe) plays a crucial role. MIGe 
incorporates the former Department of Informatics and the former Department of Mathematics and 
Geosciences, the latter recognized by the Italian Ministry of Education as a Department of Excellence. 
MIGe stands out for its approach, which targets an international and highly qualified audience: its 
courses are taught mostly in English (66% of Bachelor's degree courses and 80% of Master's degree 
courses) and talented students are recruited from all over the world, thanks in part to the 
support of University College “Luciano Fonda”, which every year offers accommodation and 
scholarships to talented MIGe students with fewer opportunities. 

Over the years, MIGe has established significant cooperation with national universities (e.g. the 
Master’s Degree program in Scientific and Data-Intensive Computing is organized with the University 
of Udine) and international HEIs (MIGe awards two double degrees: a Master's degree in Geophysics 
and Geodata with the Institut de Physique du Globe de Paris and a Master's degree in 
Mathematics with the University of Ljubljana). MIGe also has distinctive connections with the most 
important national (e.g. National Institute of Oceanography and Applied Geophysics - OGS, Area 
Science Park) and international research centres (e.g. International School for Advanced Studies - 
SISSA, ICTP) and with industry, which provide constant input and feedback on current and 
forthcoming scientific, research and market needs. On the basis of this cooperation, MIGe regularly 
updates and renews its course offerings (the international Master's Degree program in Data Science 
and Artificial Intelligence and the Master's Degree program in Scientific and Data-Intensive 
Computing started in 2023), normally with an interdisciplinary approach (in close cooperation 
with the Departments of Physics, Engineering and Architecture), as an essential element in offering 
concrete solutions to existing issues. 

In the pursuit of modernizing assessment methodologies, the Department of Mathematics, 
Informatics and Geosciences strategically decided to start four pilot courses with the STACK 
system in 2023/24: (1) Probability and Statistics within the BSc in Biotechnology, (2) Linear Algebra 
taught in one unified course for Mathematicians and Physicists, (3) Linear Algebra in one unified 
course for Civil Engineers and Environmental Engineers, (4) Linear Algebra in one unified course for 
Naval Engineers and Mechanical Engineers. In these courses, STACK educational material was made 
available to students as additional exercises for self-assessment and extra 
feedback. In courses (1) and (3), STACK was also employed for continuous assessment, with the 
resulting grade counting for a relatively small percentage of the final grade (10% in course (1) and 
7% in course (3)). Furthermore, in course (1), the STACK system’s ability to manage online 
mathematics exams was tested. The adoption of STACK at UniTS aligns well with the university's 
strategic decisions towards the national objectives of modernising higher education within the Piano 
Nazionale di Ripresa e Resilienza (National Plan of Recovery and Resilience, PNRR). The PNRR has 
provided, and will continue to provide, a considerable amount of funding to projects aligning with its 
objectives. For example, two PhD positions at UniTS were funded in 2023/24 within the framework of 
improvement of the public administration, for projects specifically involving STACK, its impact and 
development, also in relation to machine learning techniques. The MIGe Department initiated the use 
of STACK in the Italian environment, and may further develop its internal use as well as provide 
support and feedback to other national institutions that decide to experiment with its use within their 
courses. MIGe has also led, for years, several outreach activities in high schools as well as public 
engagement activities for the general public. Furthermore, the Department collaborates with the 
Accademia Nazionale dei Lincei, one of the oldest science academies in Europe, established in Rome, 
organising sponsored training programmes for high school science teachers at the national level. 

METHOD 


Experiment Overview 

The experiment comprised two phases: a traditional written exam followed by an online 
STACK exam. This section provides an overview of the experimental procedure. Figure 1 illustrates 
the flowchart outlining the process. 


[Figure 1 (flowchart): the exam was divided into two parts, a traditional written exam (Section A) 
and an online STACK exam (Section B). Flowchart elements: Pre-Survey Questionnaire, Written Exam 
(Section A), STACK Exam (Section B), Post-Survey Questionnaire, Data Analysis (compare the change 
in attitudes).] 


Figure 1. Experiment overview flowchart 


The following sections describe in depth the elements that made up this research 
methodology. 


Research Design 

This research used an experimental within-subjects design. In a within-subjects design, all 
participants experience all levels of the independent variable or conditions of the study and the 
outcomes are compared between the conditions (Charness et al., 2012). The rationale for choosing 
this design was to assess changes in attitudes over time within the same group. 


Participant Demography 

This study involved 117 students from the Department of Life Science, specifically 
Biotechnology undergraduates at the University of Trieste (UniTS). In particular, there were 85 
(73.5%) females and 31 (26.5%) males (Figure 2). 


Figure 2. Participant demography in the course. 


Participant Response Rate 

The students involved in this study had prior exposure to STACK content during their practice 
assessment exercises in a course which spanned 40 hours over 6 weeks. Out of the 117, 115 (98.3%) 
students responded to the online questionnaire which had ten items rated on a five-point Likert scale 
and 2 open-text comments. Two students did not provide any response. 


Exam Structure 

The course of general mathematics was subdivided into two sections: starting with 32 hours 
of calculus (differentiation, integration, sequences and analysis of functions’ behaviour), the course 
then delved into 40 hours (distributed over 6 weeks) of Statistics and Probability. The examination 
process began with a pen-and-paper exam covering the calculus section. The students then handed 
in the calculus exam for regular human-led assessment and accessed UniTS Moodle via either a 
laptop or a tablet. Personal smartphones were not allowed, except to generate a hotspot in the rare 
cases when the personal device did not connect easily to the eduroam Wi-Fi network of the 
University. 

The students then accessed the STACK exam via Moodle by entering a password that the lecturer 
wrote on the board on the morning of the exam. The exam was accessible only during 
the morning of the exam, was timed to stay open for 100 minutes after opening, and was set up to 
submit automatically any open attempts that had not been handed in within the 100 minutes. This 
measure was adopted to make sure that the exam was not lost if the button to hand it in froze (which 
indeed happened in rare cases). It could also happen that a student was so focused 
on the exam that they forgot to hand it in on time: with this measure, all the solved problems were 
still submitted for evaluation. 


Data Collection 

At the commencement and conclusion of the exam, students were asked to participate in a 
brief survey comprising five multiple-choice questions rated on a 5-point Likert scale, in which 
students expressed their opinions on various aspects of the examination experience: comfort, 
confidence, receptiveness to technological adaptation, and overall attitudes toward online 
examinations. The Likert scale allowed the responses to be quantified, enabling us to analyze and 
compare them efficiently (Bertram, 2007). Additionally, a final open-ended question solicited 
comments (refer to Appendix I). 


Data Analysis 

Data analysis was done using SPSS version 20 for quantitative analysis and Google Spreadsheet 
for qualitative data. Descriptive statistics, such as the mean (M) and standard deviation (SD), offered 
insights into the average participant response and the variability in opinions. Inferential statistics, 
namely paired-samples t-tests with their degrees of freedom (df), were computed to assess the 
statistical significance of changes before and after usage of the STACK system. 
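
The paired-samples comparison described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the SPSS procedure used in the study; the function name `paired_t` and the sample Likert responses are hypothetical. It computes the t statistic on the pre-minus-post differences together with the degrees of freedom (n - 1); the p-value would then be read from a t distribution with that df.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on pre - post differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one response per participant")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two paired responses are required")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, divides by n - 1
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical five-point Likert responses from six participants
# (illustrative only, not data from the study).
pre = [3, 2, 4, 3, 3, 2]
post = [4, 3, 4, 3, 5, 3]

t, df = paired_t(pre, post)
print(f"t = {t:.3f}, df = {df}")
```

With the pre-minus-post convention, a negative t corresponds to higher scores after the exam.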

The qualitative responses from students were analyzed using an inductive coding approach 
(Chandra & Shang, 2019). This involved breaking down the responses into smaller samples, 
developing codes to cover each sample, and then applying these codes to the data. The process was 
iterative, with new codes being developed based on the data and existing codes being reviewed and 
potentially revised. This approach was used to capture students’ sentiments both before and after 
they took the online exam in STACK, i.e. in the pre- and post-exam surveys. In presenting the 
qualitative analysis, the average of the pre- and post-exam responses was computed to rank the 
themes in order of magnitude. To ensure the accuracy of the coding system, all the authors 
independently reviewed students' answers, and the discrepancies that emerged were discussed in 
research meetings. 
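
The ranking step described above (averaging the pre- and post-exam counts for each theme, then ordering by magnitude) can be sketched as follows. This is an illustrative assumption about how such a ranking might be computed; the function name `rank_themes`, the theme labels and the counts are hypothetical and are not the study's data.

```python
def rank_themes(pre_counts, post_counts):
    """Rank themes by the mean of their pre- and post-exam mention counts."""
    themes = set(pre_counts) | set(post_counts)
    averages = {
        theme: (pre_counts.get(theme, 0) + post_counts.get(theme, 0)) / 2
        for theme in themes
    }
    # Highest average first; ties are broken alphabetically for a stable order.
    return sorted(averages.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical counts of responses coded under each theme (illustrative only).
pre_counts = {
    "critical thinking concerns": 19,
    "technical challenges": 12,
    "educational benefits": 10,
}
post_counts = {
    "critical thinking concerns": 19,
    "technical challenges": 4,
    "educational benefits": 16,
}

ranked = rank_themes(pre_counts, post_counts)
for theme, average in ranked:
    print(f"{theme}: {average:.1f}")
```

A theme that appears in only one of the two surveys is treated as having zero mentions in the other, which is itself a design choice rather than something the study specifies.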

RESULTS AND DISCUSSION 


Results 

In this section, we present the findings of the study, focusing on the evolution of students’ 
attitudes and perceptions regarding online examinations following their engagement with the STACK 
system. Table 1 illustrates the paired sample statistics for the questionnaire items analyzed in the 
study. 


Table 1. Paired Samples Statistics (95% confidence level) 

| Pair | Item and scale | M | SD | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|
| Pair 1 | Comfort with Online Exams (Pre-STACK Exam): (1) Very Uncomfortable ... (5) Very Comfortable | 3.10 | .892 | -.606 | 114 | .488 |
| | Comfort with Online Exams (Post-STACK Exam): (1) Very Uncomfortable ... (5) Very Comfortable | 3.16 | .951 | | | |
| Pair 2 | Preference for Exam Format (Paper vs. STACK): (1) Strongly Prefer Paper ... (5) Strongly Prefer Online | 2.42 | .908 | -3.580 | 114 | .001 |
| | Shift in Preference for Exam Format (Paper vs. STACK): (1) Shifted Toward Paper ... (5) Strongly Shifted Toward Online | 2.71 | 1.058 | | | |
| Pair 3 | Confidence in Online Mathematics Exam (Pre-STACK Exam): (1) Not Confident at All ... (5) Extremely Confident | 3.06 | .798 | -1.985 | 114 | .050 |
| | Confidence in Online Mathematics Exam (Post-STACK Exam): (1) Decreased Significantly ... (5) Increased Significantly | 3.25 | .804 | | | |
| Pair 4 | Willingness to Adapt to STACK Exams (Pre-STACK Exam): (1) Not Willing at All ... (5) Very Willing | 3.67 | .824 | 2.573 | 114 | .011 |
| | Willingness to Adapt to STACK Exams (Post-STACK Exam): (1) Not Willing at All ... (5) Very Willing | 3.45 | .881 | | | |
| Pair 5 | Overall Attitude Toward Online Exams (Pre-STACK Exam): (1) Very Negative ... (5) Very Positive | 3.27 | .732 | -.568 | 112 | .571 |
| | Overall Attitude Toward Online Exams (Post-STACK Exam): (1) Very Negative ... (5) Very Positive | 3.31 | .791 | | | |


Students’ comfort with online exams showed a marginal increase post-STACK (Mpost = 3.16) 
compared to pre-STACK (Mpre = 3.10), with the t-statistic indicating no significant difference (p = 
.488) at the 95% confidence level. The shift in preference from traditional paper-and-pencil exams to 
online methods was statistically significant (p = .001), with the mean score increasing from 
Mpre = 2.42 to Mpost = 2.71. Students generally expressed positive confidence in online 
exams with STACK. Confidence in online mathematics exams rose slightly post-STACK (Mpre 
= 3.06 to Mpost = 3.25). Although this result is borderline (p = .050, exactly at the threshold for the 
95% confidence level), the study treated it as a significant improvement based on the increase in the 
average responses. 

Willingness to adapt to new technologies remained positive post-STACK, although there was a 
slight decrease in mean willingness scores from Mpre = 3.67 to Mpost = 3.45, which was statistically 
significant (p = .011). The overall attitude toward online exams was generally positive, with a mean 
attitude score of Mpre = 3.27 pre-STACK showing a slight improvement to Mpost = 3.31 post-STACK, 
which was not statistically significant (p = .571). 
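
The significance decisions reported above can be retraced with a short sketch. The means and p-values are those stated in the text; the short item labels, the function name `summarize` and the two decision rules are illustrative assumptions. The sketch shows why the confidence item is borderline: a reported p of .050 passes an inclusive rule (p <= .05) but not a strict one (p < .05), and because the value is rounded to three decimals the report alone does not show on which side of the threshold the unrounded p falls.

```python
ALPHA = 0.05  # significance threshold matching the 95% confidence level

# (item, M_pre, M_post, p) as reported in the text; labels are shortened here.
reported = [
    ("Comfort", 3.10, 3.16, 0.488),
    ("Format preference", 2.42, 2.71, 0.001),
    ("Confidence", 3.06, 3.25, 0.050),
    ("Willingness to adapt", 3.67, 3.45, 0.011),
    ("Overall attitude", 3.27, 3.31, 0.571),
]


def summarize(rows, alpha=ALPHA):
    """Return the mean change and both significance decisions for each item."""
    summary = []
    for item, m_pre, m_post, p in rows:
        summary.append({
            "item": item,
            "change": round(m_post - m_pre, 2),
            "significant_strict": p < alpha,       # p strictly below alpha
            "significant_inclusive": p <= alpha,   # p at or below alpha
        })
    return summary


summary = summarize(reported)
for row in summary:
    print(row)
```

Only the confidence item changes classification between the two rules; the other four items are classified the same way under either.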

Figure 3 presents a comparative analysis of the thematic responses before and after the STACK 
exam, showing the number of student responses that revolved around each 
theme before and after using STACK in the exam. The major themes derived from the analysis 
encompass concerns regarding STACK's ability to assess critical thinking skills, recognition of the 
advantages of the online exam format, and the prevalence of technical challenges leading to anxiety. 
The remaining themes reveal a spectrum of student sentiments: fear and 
stress associated with online exams versus a lack of fear or stress, varied perceptions of the 
effectiveness in assessing reasoning abilities, preferences for traditional exams, timely feedback 
delivery and the convenience of the assessment method, confidence and efficiency with online exams, 
positive perceptions of online exams as a novel and beneficial innovation for learning and practice, 
willingness to adapt to innovation, recommendations for question improvement, concerns related to 
access issues such as devices, internet, and space, and positive experiences with a clear exam 
structure. These diverse themes provide a comprehensive overview of student experiences and 
opinions related to online exams with the STACK platform. 

Figure 3. Comparative Analysis of the Identified Themes from the Responses Before and After the STACK 
Exam 


On average (before and after the STACK exam), one salient concern highlighted by students revolved 
around doubts regarding how effectively the STACK content assessed the thinking process, with 
about twice the proportion of females (18.82% of the 85 females in the study) as males (9.68% of the 
31 males in the study) raising this point. The proportions converged post-exam, with 16.47% of 
females and 16.13% of males mentioning it. This was despite females performing better than males in 
the exam. This apprehension encompassed diverse opinions: some students expressed the need for the 
materials to be reviewed by the lecturer before their grades were confirmed, while others opposed 
outright its integration for assessment, for the same reason that its content mainly focuses on the final 
answer and not the thinking process. For instance, some students wrote comments expressing concerns 
about the inconvenience of online exams compared to traditional pen-and-paper assessments, citing 
challenges in the evaluation process and the rigid format enforced by STACK. 


Student 099 

Original Text - [...se si potesse inserire una parte del procedimento si riuscirebbe a valutare anche 
quello e non solo il risultato inserito] 

Translation - [...if you could somehow take into account the student's steps to the result, you 
would be able to evaluate that too and not just the final result] 


Student 042 

Original Text - [Ritengo che Gli esami online siano comunque una valida alternativa a quelli 
cartacei. E se si ripresentasse l'occasione di svolgere un esame con queste modalità, non avrei nulla 
in contrario. Ma continuo a preferire quelli cartacei, che seppur meno "comodi" lasciano maggiore 
libertà di pensiero per quanto riguarda procedimento e svolgimento.] 

Translation - [I believe that online exams are nonetheless a valid alternative to paper exams. And if 
the opportunity arises again to take an exam in this manner, I would have nothing against it. But 
I still prefer the paper-based ones, which although less 'convenient' leave more freedom of 
thought as far as the solving process is concerned]. 


Student 005 

Original Text - [Rimango dell’opinione che gli esami online siano molto più scomodi rispetto al 
cartaceo. Credo che con carta e penna sia più facile valutare il procedimento, mentre su Stack conta 
solo il risultato finale (ad esempio, potrei avere eseguito il procedimento corretto ma aver fatto un 
banale errore di calcolo in questo caso Stack dà tutto sbagliato, mentre magari nel cartaceo sarebbe 
stato valutato positivamente almeno il procedimento). Inoltre, l’esame online mi obbliga a usare 
una forma rigida per scrivere i risultati, anche se a volte questa forma non è sicuramente la più 
comoda (nel primo esercizio avrei potuto scrivere la probabilità del punto d come rapporto fra 
coefficienti binomiali, invece ho dovuto per forza scriverla come frazione ridotta ai minimi termini, 
cosa che a mio parere è un po’ inutile).] 

Translation - [I remain of the opinion that online examinations are much more inconvenient than 
paper ones. I believe that with pen and paper it is easier to evaluate the solution process, whereas 
on Stack only the final result counts (e.g. I might have carried out the correct procedure 
but made a trivial calculation error, in which case Stack marks everything as wrong, whereas 
perhaps on paper at least the procedure would have been evaluated positively). In addition, the 
online exam forces me to use a rigid form to write down the results, even though sometimes this 
form is certainly not the most convenient (in the first exercise I could have written the probability 
of point d as a ratio of binomial coefficients, but instead I had to write it as a fraction reduced to 
lowest terms, which in my opinion is a bit pointless)]. 
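
The equivalence that Student 005 describes can be made concrete with a short sketch. The Python below is only an illustration and not how STACK questions are configured: the exercise and its numbers are hypothetical (they are not taken from the exam), and `same_value` is a name introduced here. It shows that a ratio of binomial coefficients and the corresponding fraction in lowest terms are the same rational number, so a check based on exact value rather than on written form would accept both.

```python
from fractions import Fraction
from math import comb


def same_value(student_answer: Fraction, model_answer: Fraction) -> bool:
    """Accept any exact rational number equal to the model answer."""
    return student_answer == model_answer


# Hypothetical exercise (illustrative numbers only): probability of getting
# exactly 2 red balls when drawing 3 from an urn with 4 red and 6 blue balls.
as_binomial_ratio = Fraction(comb(4, 2) * comb(6, 1), comb(10, 3))
as_reduced_fraction = Fraction(3, 10)

print(same_value(as_binomial_ratio, as_reduced_fraction))  # True
```

Whether a given question should insist on one written form is a design decision for the question author; the sketch only shows that the two forms carry the same value.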


The second salient theme in the student comments concerned the educational benefits of using the 
STACK content (encouraged by males and females in equal proportions). Students particularly 
emphasised its utility for learning through practice with feedback, especially in assessment; their 
willingness to adapt to its integration; its convenience compared to paper assessment (as evidenced 
by the response of Student 042); and its clear, easy-to-follow structure, among others (all of which 
were analysed further and presented in the bar chart as well). While some students mentioned 
anxiety about online exams, others disagreed with this sentiment, stating that they found online 
exams more relaxing than paper ones as long as the same online platform is used for the assessment. 
Others even offered recommendations, as pointed out by Student 099 above. Further comments 
included the following. 


Student 043 

Original Text - [Sono una buona modalità per effettuare gli esami da terminale, anche grazie al 
fatto che si ha la possibilità di familiarizzare con STACK prima dell'esame] 

Translation - [They are a good way to carry out final examinations, partly due to the fact that you 
have the opportunity to become accustomed to STACK before the examination]. 


Student 002 

Original Text - [comodo e si ha velocemente il risultato...mi sono trovata molto bene, soprattutto 
per quanto riguarda lo studio. l'esame era chiaro e non necessitava di dimostrare i vari passaggi] 

Translation - [...convenient and you get the result quickly... I found it very good, especially in 
terms of studying. the exam was clear and did not require showing the various steps] 


Student 060 

Original Text - [Penso che STACK sia utile per fare esercizio, ma viste le difficoltà riscontrate con il 
correttore automatico del sito (soprattutto a livello di arrotondamento dei decimali, che spesso 
venivano arrotondati per difetto invece che per eccesso) temo che l'esito dell'esame possa essere 
influenzato da questo genere di errori del sito.] 

Translation - [I think STACK is useful for practice, but given the difficulties encountered with the 
site's automatic corrector (especially in rounding off decimals, which were often rounded down 
instead of up) I fear that the outcome of the exam may be affected by these kinds of errors on the 
site]. 


Student 011 

Original Text - [Fare esame su STACK non crea ansia e pressione. Essendomi esercitata molto sulla 
piattaforma mi sembra normale fare esercizi tramite schermo] 

Translation - [Taking an exam on STACK does not create anxiety or pressure. Having 
practised a lot on the platform, it seems normal to me to do exercises on a screen]. 


The third prominent theme centred on technical challenges, or the fear of encountering them, if 
STACK is used as an online assessment tool, and on the anxiety this induces. More males (16.13%) 
than females (11.76%) reported concerns about technical challenges pre-STACK, and the proportions 
fell post-STACK (6.45% of males and 7.06% of females). Proportionally more male than female 
students responded to this theme pre-STACK, having either experienced technical issues or feared 
experiencing them. This overarching theme also contributed, in part, to comments related to access 
to devices, the Internet, and space for taking exams. 


Student 103 

Original Text - [le soluzioni agli esercizi non erano sempre comprensibili e alcune volte erano errate, 
questo ha aumentato la difficoltà nello studio individuale.] 

Translation - [the solutions to the exercises were not always understandable and were sometimes 
wrong, which increased the difficulty in individual study]. 


Student 038 

Original Text - [il fatto di essere costretti a procurarsi un dispositivo autonomamente vincola chi 
deve sostenere l'esame, la possibilità di avere un'aula già fornita di computer sarebbe meglio. inoltre 
i problemi riscontrati con la piattaforma STACK prima dell'esame, per quanto sistemati, lasciano 
un senso di inquietudine nello svolgimento dell'esame] 

Translation - [the fact of having to procure a device on their own constrains those who have 
to take the examination, the possibility of having a classroom already equipped with computers 
would be better. Moreover, the problems encountered with the STACK platform before the 
examination, although fixed, leave a sense of unease when taking the examination] 


Further analysis revealed that responses raising technical challenges and anxiety decreased in 
prevalence after students interacted with the STACK exam. However, this theme remained 
prominent in both pre- and post-STACK responses. Analyzing the qualitative responses proved to be 
a complex task, especially when comparing them with the quantitative analysis before and after the 
STACK exam, so the researchers concentrated primarily on the major themes outlined earlier. While 
the minor themes held less prominence, their inclusion alongside the major ones was deemed 
valuable for capturing the diverse range of feedback. 


Discussion and Interpretation of Findings 

The objective of this pilot study was to examine the attitudes of undergraduate students at the 
University of Trieste (UniTS) regarding the use of STACK content for assessment in the final exams. 
The first concern raised by students addresses the capability of the STACK content employed to 
capture their full thinking process during the exam, with more females (18.82%) than males 
(9.68%) expressing concerns about the material's ability to assess the thinking process pre-exam. 
The proportions equalised post-exam (16.47% of females and 16.13% of males). This was despite 
females outperforming males in the exam. We remark that this feedback is specific to the STACK 
content employed and not to STACK as a system. It is certainly possible to write STACK questions 
that faithfully take the thinking process into account all the way through the exercise, and a few 
of the questions employed already do so by segmenting the task into smaller steps, each of which is 
assessed in turn (so that a mistake in one step does not lead to a negative evaluation of all 
subsequent ones). This has been studied, for instance, for proof-type questions by Bickerton and 
Sangwin (2022). Developing these advanced questions simply takes substantial resources, especially 
in terms of funding (Nakamura et al., 2012). This is an investment that we plan to make in the near 
future, ideally by contracting professional developers to write STACK content full-time for the 
University. Another approach institutions take is to exchange high-quality questions through open 
question banks (Nakamura et al., 2014), which has worked well to some extent, although there is 
no body in charge of certifying question quality. It is recognised in the community that STACK 
courses need a few iterations to be at their best (the STACK developer Chris Sangwin stated ‘a good 
course takes three years’), which is encouraging for institutions that have just started incorporating 
this technology within their education system. This pilot study was necessary precisely to 
understand these underlying areas of improvement. We conclude that the comments on the thinking 
process were explicitly motivated by a worry of being penalised in the evaluation, as students 
would normally be awarded points for sketching the thinking process, even if a final answer is not 
reached. 
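
The idea of assessing each step in turn, so that an early mistake does not invalidate later steps, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the STACK questions used in the study: the data set, the three-step exercise, and the function `mark_with_follow_through` are all hypothetical. Each step is checked against the value the student carried forward from the previous step rather than against the model solution.

```python
from math import isclose, sqrt
from typing import Callable, List, Sequence

DATA = [2.0, 4.0, 4.0, 6.0]  # hypothetical data set, for illustration only


def mean_step(_: float) -> float:
    """Step 1: mean of the data (ignores the carried value)."""
    return sum(DATA) / len(DATA)


def variance_step(mean: float) -> float:
    """Step 2: population variance computed from a given mean."""
    return sum((x - mean) ** 2 for x in DATA) / len(DATA)


def std_step(variance: float) -> float:
    """Step 3: standard deviation computed from a given variance."""
    return sqrt(variance)


def mark_with_follow_through(
    steps: Sequence[Callable[[float], float]],
    answers: Sequence[float],
    rel_tol: float = 1e-9,
) -> List[bool]:
    """Mark each step against the student's own previous answer."""
    marks = []
    carried = 0.0  # placeholder input; the first step ignores it
    for step, given in zip(steps, answers):
        marks.append(isclose(given, step(carried), rel_tol=rel_tol))
        carried = given  # later steps build on what the student wrote
    return marks


# The student gets the mean wrong (5 instead of 4) but proceeds correctly.
student = [5.0, 3.0, sqrt(3.0)]
marks = mark_with_follow_through([mean_step, variance_step, std_step], student)
print(marks)  # [False, True, True]
```

Under final-answer-only marking this student would receive no credit, since the model standard deviation is `sqrt(2.0)`; with follow-through marking, two of the three steps are credited.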

The second prominent theme was students' appreciation of STACK content in enhancing 
learning. More males (25.81%) than females (11.76%) appreciated STACK's potential to enhance 
learning pre-exam, citing benefits such as its capacity to elevate the learning experience, 
facilitate effective practice, deliver immediate feedback, simulate exam scenarios for thorough 
preparation, enhance student focus, and enable remote and interactive practice in formative 
assessments. Notably, these benefits are particularly advantageous for individuals residing at a 
distance or working part-time, thereby enhancing overall access to education. This has been 
studied especially in relation to students who thrive in a self-paced learning environment (Bishop et 
al., 2022; King, 2023; Sangwin, 2015). 

This may explain why students' comfort with online exams through STACK increased, although 
not significantly, while there was a significant shift in preference towards online assessment with 
STACK (see Table 1). This shift in preference aligns with the broader educational landscape's trend 
towards innovative assessment approaches, especially in the wake of challenges posed by the 
COVID-19 pandemic. The significant advantages associated with online assessments, as reported in 
the literature (Butler-Henderson & Crawford, 2020; Raman et al., 2021; Tai et al., 2022), such as 
enhanced accessibility, convenience, and the ability to provide prompt feedback, likely contributed 
to the students' growing comfort and preference for online assessment tools like STACK. As 
education transitions into an era characterized by technological integration, this shift in 
preferences underscores the importance of adapting pedagogical practices to meet the evolving 
needs of students. The positive trend suggests a potential receptivity to digital assessment tools, 
paving the way for further exploration of innovative technologies in educational settings. 
While the statistical analysis revealed a slight rise in confidence, the result is borderline, with a 
p-value of .050, which sits at the conventional threshold for significance at the 95% confidence 
level. Despite this borderline significance, the study treated it as a notable improvement based on 
the observed increase in average responses. This increase in confidence can be attributed to the 
structured nature of STACK assessments, as pointed out by some students, and to its ability to 
encourage practice with feedback in formative assessment and thereby reduce exam anxiety. 

Some indicators may not have changed post-STACK due to factors such as prior experience or 
familiarity with online assessment tools (Hachey et al., 2022). Students who were already 
comfortable with similar platforms might not have seen significant shifts in their attitudes or 
preferences. Additionally, the perceived relevance of certain assessment methods could play a role 
(Jamil, 2012). For example, if students already felt confident in their ability to perform well using 
traditional assessment formats, they might not have perceived a need for change or improvement. 
Furthermore, individual differences in learning styles and preferences could contribute to variations 
in post-STACK indicators (Serutla et al., 2024). Students with diverse backgrounds and experiences 
may respond differently to the introduction of new assessment methods, leading to a range of 
responses in the post-STACK data. 

The third theme concerned possible technical issues that may arise during an online exam, with 
more males (16.13%) than females (11.76%) reporting such concerns pre-exam, and fewer students 
of both sexes doing so post-exam (6.45% of males and 7.06% of females). In fact, almost no issues 
arose during the online exam sessions run so far, and the number of concerned students dropped 
considerably from pre- to post-exam. The decline in concerns about technical issues from pre-exam 
to post-exam may be attributed to increased familiarity with the platform, aided by support 
resources and training sessions, alongside potential technical improvements made during the exam 
period, in line with the work of Mahlangu and Makwasha (2023) and Almaiah et al. (2022). The 
technical problems that occurred involved a single student whose device ran out of battery (and 
was then charged in the classroom), a single student who was unable to access Moodle (which was 
solved after a number of attempts), and a single student who borrowed a device from a colleague. 

The identified insights from this pilot study hold implications for pedagogy in STEM education, 
particularly in the context of the University of Trieste (UniTS), of the Italian education system, and of 
other institutions considering the integration of tech-driven assessment tools. The observed trends 
in student perceptions, encompassing increased comfort with online exams, a notable shift in 
preference toward online assessment, and a positive willingness to adapt to technologies like STACK, 
suggest a receptivity to digital tools among STEM students. 

Connecting these insights to the broader implications for STEM education, addressing student 
concerns and leveraging the advantages offered by technology can inform and shape pedagogical 
practices (Baleni, 2015). The increase in confidence, even if marginally significant, highlights the 
potential of structured online assessments, such as those facilitated by STACK, in fostering a more 
positive learning environment (Beliauskene & Yanuschik, 2021; Erskine & Mestel, 2018; Juma, 2023; 
Oyengo et al., 2021). The immediate feedback mechanism not only enhances the learning process but 
also aligns with the principles of formative assessment according to Gagne's (1985) theory of 
learning, enabling students to practice with feedback and, in turn, reducing exam anxiety. However, 
the study also unveils certain challenges, particularly regarding the perceived lack of rigor in testing 
critical thinking skills. This discrepancy emphasizes the need for ongoing refinement and 
improvement in technology-driven assessments. Acknowledging the advantages and challenges 
identified in the study, educators can tailor their pedagogical approaches to address these nuances, 
ensuring a balanced integration of technology into STEM education (Facer, 2011; Singh, 2021). 

The study's variations in perceptions, diverse attitudes toward online assessment, and 
distinctions in learning benefits contribute to a richer understanding of the complex nature of 
student attitudes towards STACK. Moreover, the empirical evidence presented in this study 
contributes to the ongoing discourse on the advantages and challenges of STACK Assessment in 
STEM disciplines. This contextualized contribution is particularly relevant to the Italian academic 
landscape, where institutions like UniTS are new to the implementation of STACK. The findings offer 
valuable insights that can guide institutions undergoing a similar transition to online assessment 
with the STACK system, contributing to the enhancement of STEM education practices in Italy and 
beyond. 


Limitations 

The following issues were identified as potential limitations in the study. 

1. Non-Response Bias: The study recognizes the possibility of non-response bias in the qualitative 
data analysis. The responses collected may not have captured the sentiments of the entire 
student body: contented or satisfied individuals may have been less inclined to provide feedback, 
or vice versa, potentially skewing the qualitative findings towards more critical 
viewpoints. 

2. Selection Bias of First-Trial Students with Online Exams in STACK: The involvement of first-trial 
students in the study raises the possibility of selection bias as well. While their experiences 
are valuable, their responses might not be fully representative of the broader student 
population. First-trial students are generally the strongest ones and could possess 
other distinct characteristics or motivations that influence their perceptions, limiting the 
generalizability of the findings to students with more experience. These initial characteristics of 
first-trial students were not captured before the research started. 

3. Fatigue and Post-Exam Disposition: The study recognizes the influence of factors like fatigue and 
post-exam disposition on students’ responses. Acknowledging the possibility that students, eager 
to leave after the exam, might provide hurried or less thoughtful feedback, it's understood that 
these factors could impact the depth and accuracy of qualitative insights. 

4. Novelty Effect: The study underscores the potential of the novelty effect associated with the first 
online exam experience and, for some, the first university exam ever. While this novelty adds 
richness to the data, it’s acknowledged that it could introduce biased perceptions, as students 
may lack a basis for comparison. 

5. Sex Imbalance: The interplay between sex and technology acceptance is intricate and subject to 
ongoing debate, as mentioned earlier in the literature review. The sex imbalance within the pilot 
course (73.5% females, 26.5% males) is recognized as a limitation; further piloting will be done in 
other courses with different student enrolments. 


Suggestions 

Moving ahead, the following areas were identified as points for further investigation: 

1. Conducting a longitudinal study to track students’ perceptions and attitudes towards online 
examinations with the STACK system over an extended period, allowing for a deeper 
understanding of how these attitudes evolve with continued exposure and experience. 

2. Comparing the attitudes and performance of students using the STACK system for online 
examinations with those using traditional pen-and-paper assessments, exploring potential 
differences in outcomes, preferences, and experiences. Supplementing quantitative findings with 
qualitative data to gain a richer understanding of students’ experiences, attitudes, and concerns 
regarding online examinations with the STACK platform. 

3. Investigating the effectiveness of interventions aimed at addressing specific concerns raised by 
students, such as enhancing the ability of STACK questions to assess critical thinking skills or 
providing additional support for technical issues, and assessing their impact on student attitudes, 
performance, and satisfaction to inform strategies for optimizing the use of the STACK system in 
educational settings. 


CONCLUSION 


In evaluating student responses to the STACK system, several benefits and areas for 
improvement have been identified. The flexibility in the evaluation tree structure offers both 
challenges and opportunities, such as the use of a tolerance node to reduce stress in exercises 
involving descriptive statistics. Student feedback highlights the need for STACK to better consider 
the thinking process, which can be addressed by segmenting exercises into multiple steps to track 
reasoning. The study found sex-based differences in responses, with more females concerned about 
the system's ability to assess thinking processes, though these concerns equalized after the exam. 
Despite females outperforming males, more males appreciated STACK's benefits, such as immediate 
feedback and convenience. Logistical challenges, like device availability, have been mitigated by 
allowing tablets, though further solutions, such as computer lab rotations, are needed. Minimal 
technical problems were reported, and a session to familiarize students with STACK syntax is 
proposed to alleviate stress. The randomization of exercise parameters also helps deter cheating, 
making STACK a valuable tool for enhancing STEM education. 
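
Two of the mechanisms mentioned above, tolerance on numerical answers and randomized exercise parameters, can be sketched in a few lines. The Python below is a hedged illustration only and does not reproduce STACK's own implementation: `make_variant`, `grade_numeric`, the parameter ranges, and the tolerance of 0.01 are all assumptions introduced here. Seeding the generator with a student identifier gives each student a different but reproducible variant, and an absolute tolerance accepts an answer whether the last decimal was rounded down or up.

```python
import random
from math import isclose
from typing import List


def make_variant(student_id: str, n: int = 7) -> List[int]:
    """Draw per-student exercise data, reproducible from the student id."""
    rng = random.Random(student_id)  # same id always yields the same data
    return [rng.randint(1, 20) for _ in range(n)]


def grade_numeric(answer: float, correct: float, abs_tol: float = 0.01) -> bool:
    """Accept answers within an absolute tolerance of the correct value."""
    return isclose(answer, correct, rel_tol=0.0, abs_tol=abs_tol)


# Each student gets their own data set; regenerating it gives the same values.
data = make_variant("student-001")
correct_mean = sum(data) / len(data)
print(grade_numeric(round(correct_mean, 2), correct_mean))  # True

# With a correct value of 2/3, both roundings of the second decimal pass.
print(grade_numeric(0.66, 2 / 3), grade_numeric(0.67, 2 / 3))  # True True
print(grade_numeric(0.6, 2 / 3))  # False
```

The tolerance width is a judgment call for the question author: too tight and harmless rounding differences are penalised, too loose and genuinely wrong answers are accepted.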